Transcription Factor Binding Sites Prediction Based on Modified Nucleosomes
Authors
Abstract
In computational methods, position weight matrices (PWMs) are commonly used to predict transcription factor binding sites (TFBSs). Although these matrices predict actual binding sites more accurately than simple consensus sequences, they usually produce a large number of false positive (FP) predictions and so are a weak source of information. Several studies have employed additional sources of information, such as sequence conservation or proximity to transcription start sites, to distinguish true binding regions from random ones. Recently, the spatial distribution of modified nucleosomes has been shown to be associated with different promoter architectures. These aligned patterns can facilitate DNA accessibility for transcription factors. We hypothesize that using data from these aligned and periodic patterns can improve the performance of binding region prediction. In this study, we propose two effective features, "modified nucleosomes neighboring" and "modified nucleosomes occupancy", to reduce FPs in binding site discovery. Based on these features, we designed a logistic regression classifier that estimates the probability that a region is a TFBS. Our model was trained on Sp1 binding sites on chromosome 1 and tested on the other chromosomes in human CD4+ T cells. We investigated 21 histone modifications and found that only 8 of the 21 marks are strongly correlated with transcription factor binding regions. To show that these features are not specific to Sp1, we combined the logistic regression classifier with the PWM to create a new model for searching for TFBSs across the genome. We tested the model on the transcription factors MAZ, PU.1 and ELF1 and compared the results with those obtained using only the PWM. The results show that our model predicts transcription factor binding regions more successfully than the PWM alone. The relative simplicity of the model and its ability to integrate other features make it a superior method for TFBS prediction.
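The idea in the abstract, scoring a window with a PWM and then letting a logistic regression weigh that score against nucleosome-derived features, can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration: the toy 4-bp matrix, the weights, the numeric encoding of the two features, and the choice to feed the PWM score into the logistic model as one more input. The abstract does not specify any of these, and this is not the authors' implementation.

```python
import math

# Hypothetical log-odds PWM for a 4-bp motif "GGCG" (toy values, not the Sp1 matrix).
PWM = [
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
    {"A": -1.0, "C": 2.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
]

# Hypothetical logistic regression weights; a real model would fit these
# on labeled binding sites rather than hard-code them.
WEIGHTS = {"bias": -4.0, "pwm": 0.5, "neighboring": 1.5, "occupancy": 2.0}


def pwm_score(window):
    """Sum the per-position PWM scores for one window of motif length."""
    return sum(column[base] for column, base in zip(PWM, window))


def best_pwm_hit(sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    k = len(PWM)
    hits = [(pwm_score(sequence[i:i + k]), i) for i in range(len(sequence) - k + 1)]
    return max(hits)


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tfbs_probability(pwm, neighboring, occupancy):
    """Estimate the probability that a region is a TFBS.

    Assumed encodings: `neighboring` is 1 if a modified nucleosome flanks
    the region and 0 otherwise; `occupancy` is the fraction of the region
    covered by modified nucleosomes.
    """
    z = (WEIGHTS["bias"]
         + WEIGHTS["pwm"] * pwm
         + WEIGHTS["neighboring"] * neighboring
         + WEIGHTS["occupancy"] * occupancy)
    return sigmoid(z)


if __name__ == "__main__":
    score, start = best_pwm_hit("ATGGCGTA")
    print("best window starts at", start, "with PWM score", score)
    print("with marks:   ", tfbs_probability(score, 1, 1.0))
    print("without marks:", tfbs_probability(score, 0, 0.0))
```

In this sketch two windows with the same PWM score receive different probabilities depending on the nucleosome features, which is the mechanism by which such a model could discard PWM hits that lack supporting chromatin context.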
Similar resources
Prediction of Nucleosome Positioning Based on Transcription Factor Binding Sites
BACKGROUND The DNA of all eukaryotic organisms is packaged into nucleosomes, the basic repeating units of chromatin. The nucleosome consists of a histone octamer around which a DNA core is wrapped and the linker histone H1, which is associated with linker DNA. By altering the accessibility of DNA sequences, the nucleosome has profound effects on all DNA-dependent processes. Understanding the fa...
Transcription Factor Binding Site Positioning in Yeast: Proximal Promoter Motifs Characterize TATA-Less Promoters
The availability of sequence specificities for a substantial fraction of yeast's transcription factors and comparative genomic algorithms for binding site prediction has made it possible to comprehensively annotate transcription factor binding sites genome-wide. Here we use such a genome-wide annotation for comprehensively studying promoter architecture in yeast, focusing on the distribution of...
A Novel Transcription Factor Binding Sites Prediction Approach
Transcription factors (TFs) and their DNA binding motifs, called transcription factor binding sites (TFBSs) play important roles in most biological processes. However, the list for TFBSs still remains largely unknown. Machine learning approaches have been intensively applied to predict TFBSs. In this paper, a novel prediction approach has been presented based on Markov Chain Monte Carlo (MCMC) ...
Blurring of High-Resolution Data Shows that the Effect of Intrinsic Nucleosome Occupancy on Transcription Factor Binding is Mostly Regional, Not Local
Genome wide maps of nucleosome occupancy in yeast have recently been produced through deep sequencing of nuclease-protected DNA. These maps have been obtained from both crosslinked and uncrosslinked chromatin in vivo, and from chromatin assembled from genomic DNA and nucleosomes in vitro. Here, we analyze these maps in combination with existing ChIP-chip data, and with new ChIP-qPCR experiments...
Training-free atomistic prediction of nucleosome occupancy.
Nucleosomes alter gene expression by preventing transcription factors from occupying binding sites along DNA. DNA methylation can affect nucleosome positioning and so alter gene expression epigenetically (without changing DNA sequence). Conventional methods to predict nucleosome occupancy are trained on observed DNA sequence patterns or known DNA oligonucleotide structures. They are statistical...
Journal title:
Volume 9, Issue
Pages -
Publication date: 2014